Prediction of Atomization Energy Using Graph Kernel and Active Learning
Data-driven prediction of molecular properties presents unique challenges to
the design of machine learning methods concerning data
structure/dimensionality, symmetry adaptation, and confidence management. In this
paper, we present a kernel-based pipeline that can learn and predict the
atomization energy of molecules with high accuracy. The framework employs
Gaussian process regression to perform predictions based on the similarity
between molecules, which is computed using the marginalized graph kernel. To
apply the marginalized graph kernel, a spatial adjacency rule is first employed
to convert molecules into graphs whose vertices and edges are labeled by
elements and interatomic distances, respectively. We then derive formulas for
the efficient evaluation of the kernel. Specific functional components for the
marginalized graph kernel are proposed, while the effect of the associated
hyperparameters on accuracy and predictive confidence is examined. We show
that the graph kernel is particularly suitable for predicting extensive
properties because its convolutional structure coincides with that of the
covariance formula between sums of random variables. Using an active learning
procedure, we demonstrate that the proposed method can achieve a mean absolute
error of 0.62 ± 0.01 kcal/mol using as few as 2000 training samples on the QM7
data set.
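A minimal NumPy sketch of the two ingredients may help make the pipeline concrete: the spatial adjacency rule turns a molecule into a labeled graph, and a simplified marginalized graph kernel is evaluated by one linear solve on the tensor-product graph. The cutoff, length scale, decay factor, and uniform start/stop probabilities below are illustrative assumptions rather than the paper's settings, and the paper's derivation is considerably more efficient.

```python
import numpy as np

def molecule_to_graph(z, coords, cutoff=3.0):
    # Spatial adjacency rule: vertices labeled by atomic number z, edges
    # labeled by interatomic distance, present when the distance < cutoff.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    adj = (d < cutoff) & ~np.eye(len(z), dtype=bool)
    return np.asarray(z), np.where(adj, d, 0.0)

def marginalized_kernel(g1, g2, length_scale=0.5, decay=0.9):
    # Simplified marginalized graph kernel: expected similarity of random
    # walks, evaluated via one linear solve on the tensor-product graph.
    (z1, d1), (z2, d2) = g1, g2
    n1, n2 = len(z1), len(z2)
    kv = np.equal.outer(z1, z2).astype(float)             # vertex (element) kernel
    ke = np.exp(-(d1[:, None, :, None] - d2[None, :, None, :]) ** 2
                / (2.0 * length_scale ** 2))              # edge (distance) kernel
    ke *= (d1 > 0)[:, None, :, None] * (d2 > 0)[None, :, None, :]
    W = (ke * kv[None, None, :, :]).reshape(n1 * n2, n1 * n2)
    W *= decay / max(np.abs(W).sum(axis=1).max(), 1e-12)  # keep I - W invertible
    p = kv.ravel() / (n1 * n2)                            # uniform start probabilities
    return p @ np.linalg.solve(np.eye(n1 * n2) - W, kv.ravel())

def gpr_predict(train_graphs, y, test_graphs, noise=1e-2):
    # Gaussian process regression on the normalized kernel Gram matrix.
    gram = lambda A, B: np.array([[marginalized_kernel(a, b) for b in B] for a in A])
    K = gram(train_graphs, train_graphs)
    d = np.sqrt(np.diag(K))
    Ks = gram(test_graphs, train_graphs)
    ds = np.sqrt([marginalized_kernel(a, a) for a in test_graphs])
    alpha = np.linalg.solve(K / np.outer(d, d) + noise * np.eye(len(y)), y)
    return (Ks / np.outer(ds, d)) @ alpha
```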
Nonlinear Matrix Approximation with Radial Basis Function Components
We introduce and investigate matrix approximation by decomposition into a sum
of radial basis function (RBF) components. An RBF component is a generalization
of the outer product between a pair of vectors, where an RBF replaces
the scalar multiplication between individual vector elements. Even though the
RBF functions are positive definite, the summation across components is not
restricted to convex combinations and allows us to compute the decomposition
for any real matrix that is not necessarily symmetric or positive definite. We
formulate the problem of seeking such a decomposition as an optimization
problem with a nonlinear and non-convex loss function. Several modern versions
of the gradient descent method, including their scalable stochastic
counterparts, are used to solve this problem. We provide extensive empirical
evidence of the effectiveness of the RBF decomposition and that of the
gradient-based fitting algorithm. While being conceptually motivated by
singular value decomposition (SVD), our proposed nonlinear counterpart
outperforms SVD by drastically reducing the memory required to approximate a
data matrix with the same L2 error for a wide range of matrix types. For
example, it yields a 2- to 6-fold memory saving for Gaussian noise, graph
adjacency matrices, and kernel matrices. Moreover, this proximity-based
decomposition can offer additional interpretability in applications that
involve, e.g., capturing the inner low-dimensional structure of the data,
retaining graph connectivity structure, and preserving the acutance of images.
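A sketch of the decomposition and a plain gradient-descent fit, written in NumPy with hand-derived gradients, may illustrate the idea. The Gaussian profile, the shared width s, and the learning rate and iteration count are assumptions for illustration; the paper explores several modern gradient-descent variants, including stochastic ones.

```python
import numpy as np

def rbf_components(U, V, w, s=1.0):
    # M_ij = sum_r w_r * exp(-(U[r,i] - V[r,j])^2 / (2 s^2)): each component
    # generalizes an outer product, with an RBF in place of multiplication.
    E = np.exp(-(U[:, :, None] - V[:, None, :]) ** 2 / (2.0 * s ** 2))  # (r, m, n)
    return np.einsum('r,rmn->mn', w, E), E

def fit_rbf_decomposition(A, rank=4, s=1.0, lr=0.1, steps=3000, seed=0):
    # Plain gradient descent on the mean squared error (the paper also uses
    # adaptive and stochastic variants); weights w are unconstrained reals.
    rng = np.random.default_rng(seed)
    m, n = A.shape
    U = rng.standard_normal((rank, m))
    V = rng.standard_normal((rank, n))
    w = rng.standard_normal(rank)
    for _ in range(steps):
        M, E = rbf_components(U, V, w, s)
        R = 2.0 * (M - A) / (m * n)                  # d(mean sq. error)/dM
        dw = np.einsum('mn,rmn->r', R, E)
        G = w[:, None, None] * E * R[None, :, :]     # shared chain-rule factor
        D = (U[:, :, None] - V[:, None, :]) / s ** 2
        U -= lr * -(G * D).sum(axis=2)
        V -= lr * (G * D).sum(axis=1)
        w -= lr * dw
    return U, V, w

# Example: approximate a small kernel-like matrix, report the relative L2 error.
x, t = np.linspace(0, 4, 40), np.linspace(0, 4, 30)
A = np.exp(-np.abs(x[:, None] - t[None, :]))
U, V, w = fit_rbf_decomposition(A, rank=3)
rel_err = np.linalg.norm(rbf_components(U, V, w)[0] - A) / np.linalg.norm(A)
```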
Learning Stochastic Dynamics with Statistics-Informed Neural Network
We introduce a machine-learning framework named statistics-informed neural
network (SINN) for learning stochastic dynamics from data. This new
architecture is theoretically motivated by a universal approximation theorem
for stochastic systems, which we introduce in this paper, and the
projection-operator formalism for stochastic modeling. We devise mechanisms for
training the neural network model to reproduce the correct \emph{statistical}
behavior of a target stochastic process. Numerical simulation results
demonstrate that a well-trained SINN can reliably approximate both Markovian
and non-Markovian stochastic dynamics. We demonstrate the applicability of SINN
to coarse-graining problems and the modeling of transition dynamics.
Furthermore, we show that the obtained reduced-order model can be trained on
temporally coarse-grained data and hence is well suited for rare-event
simulations.
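The statistics-matching idea can be illustrated with a toy NumPy example. Everything below is an assumption for illustration: the paper trains an RNN-type network, whereas here a two-parameter linear recurrence x[t+1] = a*x[t] + b*xi[t] stands in as the stochastic model; the loss matches the first four marginal moments plus the two-point correlation function; and the fit uses finite-difference gradient descent.

```python
import numpy as np

def autocorr(x, max_lag):
    # Ensemble- and time-averaged two-point correlation; x has shape (batch, T).
    x = x - x.mean()
    return np.array([(x[:, :x.shape[1] - t] * x[:, t:]).mean() for t in range(max_lag)])

def statistical_loss(x_model, x_data, max_lag=50):
    # Compare statistics, not trajectories: low-order marginal moments plus the
    # autocorrelation function, in the spirit of SINN's training objective.
    moments = sum((np.mean(x_model ** k) - np.mean(x_data ** k)) ** 2 for k in (1, 2, 3, 4))
    corr = np.mean((autocorr(x_model, max_lag) - autocorr(x_data, max_lag)) ** 2)
    return moments + corr

def simulate(a, b, noise):
    # Toy stochastic model: linear recurrence driven by i.i.d. Gaussian input.
    x = np.zeros_like(noise)
    for t in range(1, noise.shape[1]):
        x[:, t] = a * x[:, t - 1] + b * noise[:, t]
    return x

rng = np.random.default_rng(0)
x_data = simulate(0.9, 0.5, rng.standard_normal((256, 400)))  # "target" process
theta = np.array([0.5, 1.0])                                  # initial (a, b)
for _ in range(300):
    noise = rng.standard_normal((256, 400))                   # fresh noise each step
    grad = np.zeros(2)
    for i in range(2):                                        # finite differences
        e = np.zeros(2); e[i] = 1e-3
        grad[i] = (statistical_loss(simulate(*(theta + e), noise), x_data)
                   - statistical_loss(simulate(*(theta - e), noise), x_data)) / 2e-3
    theta -= 0.02 * grad
    theta[0] = np.clip(theta[0], -0.99, 0.99)                 # keep recurrence stable
```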
Learning Recurrent ANFIS Using Stochastic Pattern Search Method
Pattern search learning is known for its simplicity and fast convergence. However, one of its drawbacks is premature convergence. In this paper, we show how the risk of being trapped in a local minimum can be avoided by introducing a stochastic component. The improved pattern search is then applied to a recurrent neuro-fuzzy network (ANFIS) to solve time series prediction problems. Comparison with other methods shows the effectiveness of the proposed approach for this task.
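A compact sketch of the idea: classic coordinate pattern search with a random perturbation injected into each trial point so the search can escape shallow local minima. The perturbation scheme and parameters below are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def stochastic_pattern_search(f, x0, step=0.5, shrink=0.5, sigma=0.1,
                              tol=1e-6, max_iter=10_000, seed=0):
    # Coordinate pattern search; the stochastic term perturbs each trial point
    # so the search is not deterministically trapped in a local minimum.
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(max_iter):
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                trial = x.copy()
                trial[i] += d + sigma * step * rng.standard_normal()
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step *= shrink                       # contract the pattern
            if step < tol:
                break
    return x, fx

# Example on a multimodal test function; for the recurrent ANFIS application,
# x would collect the network's parameters and f the prediction error.
rastrigin = lambda x: 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))
x_best, f_best = stochastic_pattern_search(rastrigin, np.full(4, 3.0))
```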
Detecting Label Noise via Leave-One-Out Cross-Validation
We present a simple algorithm for identifying and correcting real-valued
noisy labels from a mixture of clean and corrupted sample points using Gaussian
process regression. A heteroscedastic noise model is employed, in which
additive Gaussian noise terms with independent variances are associated with
each observed label. Optimizing the noise model using maximum
likelihood estimation ensures that the GPR model's leave-one-out predictive
error is contained within the posterior standard deviation. A
multiplicative update scheme is proposed for solving the maximum likelihood
estimation problem under non-negativity constraints. While we prove
convergence only for certain special cases, the multiplicative scheme
exhibits monotonic convergence empirically in virtually all of our
numerical experiments. We show that the presented method can pinpoint corrupted
sample points and lead to better regression models when trained on synthetic
and real-world scientific data sets.
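A minimal NumPy sketch under explicit assumptions: the RBF kernel, the initialization, and the specific multiplicative rule (the ratio of the negative to the positive part of the likelihood gradient, which keeps the variances non-negative by construction) are illustrative choices consistent with the abstract, not necessarily the paper's exact formulas. Points whose fitted noise variance s_i remains large are the suspected corrupted labels; the standard GPR leave-one-out identities expose the residual/standard-deviation comparison.

```python
import numpy as np

def rbf_kernel(X, length_scale=1.0):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * length_scale ** 2))

def fit_noise_variances(K, y, iters=200, floor=1e-8):
    # Heteroscedastic GPR: one additive noise variance s_i per label, fitted by
    # maximum likelihood with a multiplicative update; non-negativity is
    # preserved automatically since s_i is only ever scaled by a positive ratio.
    s = np.full(len(y), np.var(y))
    for _ in range(iters):
        Q = np.linalg.inv(K + np.diag(s))
        alpha = Q @ y
        # dNLL/ds_i = 0.5 * (Q_ii - alpha_i^2); the update raises s_i where the
        # fit is poor relative to the predictive variance, lowers it elsewhere.
        s = np.maximum(s * alpha ** 2 / np.maximum(np.diag(Q), floor), floor)
    return s

def loo_diagnostics(K, y, s):
    # Standard GPR leave-one-out identities:
    #   LOO residual_i = alpha_i / Q_ii,  LOO std_i = sqrt(1 / Q_ii).
    Q = np.linalg.inv(K + np.diag(s))
    alpha = Q @ y
    return alpha / np.diag(Q), np.sqrt(1.0 / np.diag(Q))
```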